(54) Method and apparatus for providing for notification of task termination in an information
handling system



(57) A method and apparatus for ensuring that a 
process interacting with a failing process is notified of 
the failure of that process. Each process has a unique 
process identifier (PID) associated with it. Each process
optionally has an affinity list containing one or more en- 
tries, each of which contains the identifier of a process 
that is to be notified when the process fails. A process 
updates the affinity list of a target process (either itself 
or another process) by calling an affinity service of the 
operating system (OS) kernel, specifying the type of operation
(add or delete), the identifier of the target process,
the identifier of the process that is to be notified, and
the type of event that is to be generated for the process 
that is to be notified. When a process fails, a process 
termination service of the OS kernel examines the affinity
list of the failing process and, for each entry in the
list, generates an event of the specified type for the proc- 
ess specified as to be notified. 
Description

[0001] This invention relates to a method and apparatus
for providing for notification of task termination
and, more particularly, to a method and apparatus for 
providing for notification of process termination in a cli- 
ent/server system. 

[0002] Client/server computing systems are well 
known in the art. In a client/server system, a client process
(or simply "client") issues a request to a server process
(or simply "server"), either on the same system or
on a different system, to perform a specified service. Up- 
on receiving the request, the server process performs 
the requested service and returns the result in a re- 
sponse to the client process. 

[0003] When creating a client/server application on a 
single system, there is frequently a need for a client to 
communicate requests to a server and to wait for the 
server to respond. Similarly, there can be multiple server
processes that need to communicate with multiple client 
processes. If a client is waiting for a response from a 
server and the server terminates, the client process may 
hang in a wait until a user or operator makes a request 
to terminate the client process. Similarly, a server may 
be waiting for a response from a client and have the cli- 
ent terminate. Both client and server can add timer calls 
into their logic to cause the wait to time out, but this can 
cause unnecessary path length and requires the client 
or server application to pick a suitable time period. 
[0004] In UNIX®-based systems, there are several
programming constructs that can be used to keep track
of the connection between multiple processes. If an application
uses a fork() or spawn() service to create a
child process, then the two processes are tied together
by the UNIX framework. That is, if the child process terminates,
the parent process is sent a SIGCHLD signal.
If the parent process terminates, the child process is
sent a SIGHUP signal. However, since interacting server
and client processes are usually not bound together
by this parent-child relationship, this mechanism is of
little use as a general notification mechanism in UNIX-based
systems.

[0005] According to one aspect of the invention there 
is provided a method of providing for notification of task 
termination in an information handling system having a 
plurality of interacting tasks, the method comprising the 
steps of: defining for each of one or more target tasks 
an affinity list containing one or more entries for other 
tasks that are to be notified on termination of the target 
task; in response to receiving an affinity request speci- 
fying a target task and another task, adding an entry for 
the other task to an affinity list defined for the target task; 
and in response to detecting a termination of a target 
task, notifying each other task contained in the affinity 
list defined for the target task. 

[0006] According to a second aspect of the invention
there is provided apparatus for providing for the notification
of task termination in an information handling system
having a plurality of interacting tasks, the apparatus
comprising: means for defining for each of one or more
target tasks an affinity list containing one or more entries
for other tasks that are to be notified on termination of
the target task; means responsive to receiving an affinity
request specifying a target task and another task for
adding an entry for the other task to an affinity list defined
for the target task; and means responsive to detecting
a termination of a target task for notifying each
other task contained in the affinity list defined for the
target task.

[0007] According to a third aspect of the invention 
there is provided a computer program element compris- 
ing computer program code means executable by the 
computer to: define for each of one or more target tasks 
an affinity list containing one or more entries for other 
tasks that are to be notified on termination of the target 
task; in response to receiving an affinity request speci- 
fying a target task and another task, add an entry for the 
other task to an affinity list defined for the target task; 
and in response to detecting a termination of a target 
task, notify each other task contained in the affinity list 
defined for the target task. 

[0008] Thus a solution to the aforementioned problem
is provided by the pid affinity service of the present
invention, described below. The term pid stands for process
id. Both the server and client processes have unique
PIDs. The pid affinity service is used to create an affinity
or bond between the client and server process, such that
when one of them terminates, a mechanism is provided
to drive a signal to notify the other waiting process.
[0009] As will be described below, each process in the
operating system optionally has a pid affinity list that
identifies processes that wish to be notified (via signal)
when the process terminates. The pid affinity service
provides the mechanism for a client to add its pid to a
server's pid affinity list or for the server to add its pid to
the client's pid affinity list. It is up to an application to
determine which processes use the pid affinity service.
[0010] As an example of the operation of the present
invention, suppose that a client is about to make a request
to a server using a message queue. Prior to placing
the request on the server input queue (by issuing a
msgsnd system call), the client calls the pid affinity service
to add its pid to the pid affinity list of the server. The
client then issues a msgsnd system call to place the request
on the server input queue. The client then issues
a msgrcv system call to wait for a response from the
server. While the client is in this message queue wait, the server
may terminate. If this happens, the kernel will see the
pid affinity list and send a signal to each process (represented
by a pid) that is on the pid affinity list. The signal
will wake up the client process from the msgrcv wait and
allow it to fail the current request and return control to
the calling process.
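
The sequence in this example can be sketched as a small simulation. This is only an illustrative model, not the kernel implementation: the Kernel class, its method names (borrowed from the msgsnd and msgrcv calls for readability), the PID values and the SERVER_GONE event are all hypothetical, and the signal is modeled simply as an item placed on the waiting process's queue.

```python
import queue

SERVER_GONE = "server-terminated"  # hypothetical event value standing in for a signal


class Kernel:
    """Toy stand-in for the OS kernel; all names here are illustrative assumptions."""

    def __init__(self):
        self.affinity = {}  # target pid -> list of (pid to notify, event)
        self.inbox = {}     # pid -> queue that the process waits on

    def register(self, pid):
        self.inbox[pid] = queue.Queue()
        self.affinity[pid] = []

    def pid_affinity_add(self, target_pid, event_pid, event):
        # Record that event_pid wants `event` when target_pid terminates.
        self.affinity[target_pid].append((event_pid, event))

    def msgsnd(self, to_pid, message):
        self.inbox[to_pid].put(message)

    def msgrcv(self, pid):
        # Non-blocking here so the sketch stays deterministic.
        return self.inbox[pid].get_nowait()

    def terminate(self, pid):
        # On termination, notify every process on the terminating pid's list.
        for event_pid, event in self.affinity.pop(pid, []):
            if event_pid in self.inbox:
                self.inbox[event_pid].put(event)
        self.inbox.pop(pid, None)


CLIENT, SERVER = 100, 200
kernel = Kernel()
kernel.register(CLIENT)
kernel.register(SERVER)

kernel.pid_affinity_add(SERVER, CLIENT, SERVER_GONE)  # client joins the server's list
kernel.msgsnd(SERVER, "request")                      # request goes on the server queue
kernel.terminate(SERVER)                              # server dies before replying
reply = kernel.msgrcv(CLIENT)                         # the wait ends with the event
outcome = "failed" if reply == SERVER_GONE else "ok"
```

In this model the client's wait is ended by the notification rather than by a timer, which is the behavior the example describes.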

[0011] The above description of a client/server communication
using message queues is simply one example
of how processes may communicate. They could also
use shared memory, semaphores or any other communication
mechanism. This example also described
the client and server as simple single-threaded processes.
It is possible for a server to be multithreaded and
handling many requests concurrently from multiple clients.
If such a server were to terminate, it would cause
the notification of all the clients in its pid affinity list.
[0012] A preferred embodiment of the invention will
now be described, by way of example only, with reference
to the accompanying drawings in which:

Fig. 1 shows the parameter list passed to the pid 
affinity service, a process information block and the 
pid affinity list; 

Fig. 2 shows the flow and logic for adding another
process's PID to the caller's own PID affinity list and the result
of termination of the calling process;

Fig. 3 shows the flow and logic for adding the caller's
PID to the PID affinity list of a target process and
the actions triggered by the termination of that tar- 
get process; 

Fig. 4 shows the entry to the PID affinity service as
well as the delete entry processing; 

Fig. 5 shows the add entry logic of the PID affinity 
service. 

[0013] Referring first to Figs. 2-3, an embodiment of
the present invention contains a PID affinity service 221
in the kernel address space 204 of a system also having
one or more user address spaces 202 including processes
206 (process A) and 216 (process B). Kernel address
space 204 is part of an operating system (OS) kernel
(not separately shown) running, together with one
or more user programs in user address spaces 202, on
a general-purpose computer having a central processing
unit (CPU), main and secondary storage, and various
peripheral devices that are conventional in the art
and therefore not shown. Although the present invention
is not limited to any particular hardware or software platform,
a preferred embodiment may be implemented as
part of the IBM® OS/390® operating system, running
on an IBM S/390® processor such as an S/390 Parallel
Enterprise Server™ G4 or G5 processor.
[0014] Referring now to Fig. 1, each process in the 
system has a process information block (PIB) 114 asso- 
ciated with it. Each PIB 114 contains a process id (PID) 
116 uniquely identifying the process and a pointer 118 
to a PID affinity list (PAL) 120, as well as other items that
are not related to the present invention and are therefore 
not shown. For each call to the pid affinity service 221 
to add a PID to the list 120, an entry 122 is made in the 
list. Each entry 122 contains the PID 124 of the process 
to be notified of an event and an event type 126, which 
could be a signal number. 
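
As a rough illustration of the structures of Fig. 1, the process information block and the affinity list entries could be modeled as follows. The class and field names are assumptions made for this sketch; only the reference numerals in the comments come from the description, and SIGTERM is used merely as an example event.

```python
import signal
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AffinityEntry:
    """One entry 122 in a PID affinity list (field names are hypothetical)."""
    notify_pid: int  # PID 124 of the process to be notified
    event: int       # event type 126, for example a signal number


@dataclass
class ProcessInfoBlock:
    """Process information block (PIB) 114, reduced to the two fields described."""
    pid: int                                             # PID 116
    affinity_list: Optional[List[AffinityEntry]] = None  # pointer 118 to PAL 120


# A server process whose termination should deliver SIGTERM to process 100.
server = ProcessInfoBlock(pid=200)
server.affinity_list = [AffinityEntry(notify_pid=100, event=int(signal.SIGTERM))]
```

The list is left as None until the first entry is added, mirroring the statement that each process only optionally has an affinity list.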



[0015] A pid affinity parameter list (PL) 100 contains
the parameters specified by an application program as
input to the pid affinity service 221, as well as output
from the pid affinity service 221. These parameters include
a function code 102, a target process parameter
104, an event process parameter 106, an event parameter
108 and a return code 110. Parameters 102-108
are input parameters supplied by the calling application
to the PID affinity service 221, while return code 110 is
an output parameter returned by the PID affinity service
221 to the calling application.

[0016] The function code 102 specifies which pid affinity
service function is requested by the application
program. Supported function codes 102 are adding an
entry 122 to an affinity list 120 and deleting an entry 122
from an affinity list 120.

[0017] The target process parameter 104 specifies
the target process (as identified by its PID) whose affinity
list 120 is the target of the operation specified by the
function code parameter 102.

[0018] The event process parameter 106 has different
uses based upon the function code parameter 102
specified. The event process 106 identifies the
process to which the event is to be delivered when the target
process terminates. When an application specifies a
function code 102 to add an entry 122 to an affinity list
120, the contents of this parameter 106 are copied into
the PID field 124 of an entry 122 in the affinity list 120 of the
process specified by the target process parameter 104. When an
application specifies the function code parameter 102 to
delete an entry 122 from an affinity list 120, the contents
of this parameter 106 are compared with the PIDs
124 of existing entries in the affinity list 120 of the process
specified by the target process parameter 104. If an entry 122 with a
matching process identifier 124 is found, it is cleared
and is available to be reused.

[0019] The event parameter 108 specifies the event
126 to be generated when the target process 104 terminates.
This parameter 108 is unused when the function
code parameter 102 requests deletion of an entry
122. When the function code parameter 102 specifies
adding an entry 122 to an affinity list 120, the contents
of this parameter 108 are copied to the event field 126 of an
entry 122 in the affinity list 120 of the process specified by the target
process parameter 104.

[0020] The fifth parameter 110 contains the return 
code generated by the pid affinity service. It is used to 
indicate the success or failure of the pid affinity service 
to the application program. 
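
The parameter list 100 described above might be represented as shown below. This is a sketch under stated assumptions: the names FunctionCode, PidAffinityParams and validate are hypothetical, as are the numeric return codes; the description only says that parameters 102-108 are inputs, that return code 110 is an output, and that the event parameter is unused for deletion.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FunctionCode(Enum):
    """Function code 102: the two supported operations."""
    ADD = 1
    DELETE = 2


@dataclass
class PidAffinityParams:
    """Parameter list 100 (field names are hypothetical)."""
    function_code: FunctionCode        # 102, input
    target_pid: int                    # 104, input
    event_pid: int                     # 106, input
    event: Optional[int] = None        # 108, input, unused for DELETE
    return_code: Optional[int] = None  # 110, output set by the service


def validate(params: PidAffinityParams) -> int:
    """Return 0 if the inputs look usable, else a distinct nonzero code.

    The specific code values are assumptions made for this sketch.
    """
    if not isinstance(params.function_code, FunctionCode):
        return 1
    if params.target_pid <= 0:
        return 2
    if params.event_pid <= 0:
        return 3
    if params.function_code is FunctionCode.ADD and params.event is None:
        return 4
    return 0


request = PidAffinityParams(FunctionCode.ADD, target_pid=200, event_pid=100, event=15)
request.return_code = validate(request)
```

Using a distinct code per failing parameter lets the caller tell which input was rejected.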

[0021] Fig. 2 shows the usage of the pid affinity service
221 when a client program adds its PID to the pid
affinity list 120 of a server. Fig. 2 shows user address
spaces 202 and a kernel address space 204. The kernel
address space 204 is where services are provided that
allow applications to communicate with other user address
spaces 202. In this example, user address spaces
202 include a client address space 206 (process A) that
is communicating with a server address space 216
(process B).

[0022] The client 206 initially assigns a work request
to the server 216 (step 208). As discussed earlier, one
means of doing this is by placing a message on a message
queue. After assigning the work request at step
208, the client 206 waits for a response from the server
216 by invoking a wait function 212 in the kernel address
space 204 (step 210). The wait function 212 could be a
general-purpose wait function, or it could be a function
like msgrcv that waits for a message or a signal. This is
standard programming practice on UNIX systems. As
described, for example, in W. R. Stevens, UNIX Network
Programming, 1990, pages 126-137, incorporated
herein by reference, in a UNIX system a process wishing
to send a message to another process may issue a msgsnd
system call to place a message in a message
queue. That other process may in turn issue a msgrcv
system call to retrieve the message from the message
queue.

[0023] In the server space 216, shown as process B,
the server receives the work request from the client 206
(step 218). This could be accomplished using a function
like msgrcv to receive a message placed on a message
queue by the client 206 at step 208. After receiving the
work request at step 218, the server 216 calls the pid
affinity service (pid_affinity) 221 of the present invention
with a function code 102 of add, a target process PID
104 of process B (itself), an event process 106 of process
A, and an event 108, which could be a particular signal
(step 220). Once this step is completed, should anything
happen to terminate the server process 216, the
client process 206 is guaranteed to be notified with the
requested event 108.

[0024] Next, server 216 processes the work request
assigned at step 208 (step 222). Assuming no errors occur,
the server 216 completes the work request
and then notifies the client 206 of the completion
(step 226). This could be accomplished by sending a
message to the client 206 with the results of the work
request. The msgsnd by the server 216 would wake up
the client 206 in a msgrcv wait 212. After notifying client
process 206 at step 226, the server 216 calls the pid
affinity service 221 with function code 102 to delete an
entry 122 in the PID affinity list 120 for a target process
104 set to process B 216 (itself) (step 228). The event
process 106 is set to process A 206. After the pid affinity
service 221 completes the request, the entry 122 for the
client process 206 is removed from the PID affinity list
120 for the server process 216.

[0025] During this server processing, suppose a terminating
event 224 occurs, which prevents the server
from completing the work request at step 226. In this
case, the kernel 204 gets control in process termination
230. As part of process termination 230, the kernel handles
any entries 122 in the PID affinity list 120 for the
terminating process 216. If an entry in the PID affinity
list 120 is filled in (step 232), then the kernel generates
the event 126 and targets this event to the PID 124 in
the entry 122 of the PID affinity list 120 (step 234).
[0026] The generation of the event at step 234 causes
the target process 206 to be resumed from its wait condition
212 (step 236) and triggers the delivery of the abnormal
event 126 to an event exit 238 of process A 206.
The client code in the event exit 238 is notified of the
termination of the server 216 (process B) from which it
was awaiting a response (step 240). The client event
exit 238 can then decide whether to terminate or retry
the request. What the client does when notified is not
part of the present invention and is therefore not described.
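
The walk through the affinity list during process termination (steps 232 and 234) can be sketched as below. The function name process_term follows the service name used in the description, but its signature, the tuple layout of the entries and the deliver callback are assumptions made for illustration.

```python
def process_term(pal, deliver):
    """Generate the recorded event for every filled-in entry of a terminating
    process's affinity list.

    pal     -- list of (notify_pid, event) tuples, or None if no list exists
    deliver -- callback that generates `event` for `notify_pid` (hypothetical)
    Returns the number of events generated.
    """
    delivered = 0
    for notify_pid, event in (pal or []):  # step 232: for each filled-in entry
        deliver(notify_pid, event)         # step 234: target the event at the PID
        delivered += 1
    return delivered


# Record deliveries instead of sending real signals.
woken = []
count = process_term([(100, 15), (101, 10)],
                     lambda pid, event: woken.append((pid, event)))
```

On a UNIX-like system, a user-space experiment could pass a callback built on os.kill in place of the recording lambda; that substitution is a suggestion here, not something the description specifies.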

[0027] Fig. 3 shows another model supported by the
pid affinity service 221. In this case, a client process 302
(process C) determines the PID of a server process 320
(process D) with which it will soon communicate. This
may be accomplished with shared memory, configuration
files or other means not related to the present invention.
The client process 302 then calls the pid affinity
service 221 with a function code 102 of add, a target
process 104 PID for process D 320 (the server), an
event process 106 set to process C 302 (the client itself)
and the event 108 it wishes to receive if the server 320
terminates while processing its request (step 304).
[0028] The client 302 then assigns work to the server
process 320 via a message queue or other communication
mechanism (step 306). The client 302 then calls the
wait service 212 to wait for a response from the server
320 (step 308). The wait service 212 puts the client 302
to sleep until the requested function completes or an abnormal
event is received.

[0029] In the meantime, the server 320 has received
the work request (step 322) and is processing the work
(step 324). If all works successfully, the server 320 notifies
the client 302 when the work completes (step 328).
This notification at step 328 causes the client process
302 to exit the wait function 212 with a successful return
code. Upon receiving control back from wait, the client
302 calls the pid affinity service 221 to undo the call
made at step 304 (step 310). This call at step 310 will
set the function code 102 to request delete, the target
process 104 will identify server process D 320 and the
event process 106 will identify this client 302.
[0030] If a terminating event 326 hits the server 320,
then it will trigger the process termination service
(process_term) 230. Process termination service 230
will run through the PID affinity list 120 for server process
D 320 and, for each entry in the PID affinity list (step
232), it will generate the requested event 126 to the
target PID 124 (step 234). In this case, the target PID
124 identifies client process C 302 and the event 126 is
what was passed in the event parameter 108 in step
304.

[0031] When the event is generated at step 234, it
causes client process C 302 to be taken out of the wait
212 with an interrupt (step 340). The wait function 212,
instead of returning to the caller after step 308, now
passes control to the event exit 311. The event exit 311
is notified of the termination of server process D (step
312). At this point, the client code 302 can either terminate,
retry the request or request a different service.
[0032] Fig. 4 shows the processing of the PID affinity
service 221. On entry, the service 221 validates the caller's
parameters (step 402). If the function code 102, target
process PID 104, event process PID 106, or event
108 is invalid, then the service 221 sets a unique failing
return code (step 404) and returns to the caller (step
406). Assuming all parameters are valid, the service 221
obtains a process lock for the target process 104 (step
408). This lock serializes updates to the PID affinity list
120 (hereafter referred to as PAL) of the target process
104 for multiple callers.

[0033] If the target process 104 does not yet have a
PAL 120 (step 410), then storage is obtained for the PAL
120 and the location of the PAL 120 is stored in the Process
Information Block (PIB) 114 in field 118 (step 412).
Next the function code 102 is tested to determine whether
add or delete processing is requested (step 416). If
add processing is requested, processing is as described
in Fig. 5 (step 418).

[0034] For delete processing, the PAL 120 is scanned
for an entry 122 that has a PID 124 that matches the
event process PID 106 passed as input (step 414). If a
matching entry 122 is found (step 420), then the entry
122 is cleared and the last entry 122 in the PAL 120 is
moved to the cleared entry to keep the table packed
(step 422). The process lock is then released and control
is returned to the caller (step 406). If the entry 122
is not found, then the process lock is released and control
is returned to the caller without performing the deletion
step 422 (step 406).
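
A minimal sketch of this delete processing, assuming the PAL is held as a Python list of (PID, event) tuples and the process lock is an ordinary threading.Lock, could look like this. The function name pal_delete and the data layout are assumptions; the step numbers in the comments refer to Fig. 4.

```python
import threading


def pal_delete(pal, lock, event_pid):
    """Remove the entry whose PID matches event_pid.

    Returns True if an entry was removed, False if none matched.
    """
    with lock:                                   # step 408: serialize updates
        for i, (pid, _event) in enumerate(pal):  # step 414: scan the PAL
            if pid == event_pid:                 # step 420: match found
                pal[i] = pal[-1]                 # step 422: move last entry into the hole
                pal.pop()                        # drop the now-duplicated last slot
                return True
        return False                             # not found: nothing to delete


lock = threading.Lock()
pal = [(100, 15), (101, 15), (102, 15)]
found = pal_delete(pal, lock, 100)  # pal becomes [(102, 15), (101, 15)]
```

Moving the last entry into the cleared slot keeps the table packed without shifting every later entry, at the cost of not preserving entry order.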

[0035] Fig. 5 shows the processing to add an entry to
the PAL 120. The target process PID 104 is tested (step
502) to determine if it is the same as the caller's PID
116. If they match, it means that if the calling process
terminates, it will cause a signal (event 108) to be sent
to the event process 106. Before adding the entry 122
to the PAL 120, a test is made to determine if the calling
process is allowed to send a signal (event 108) to the
event process 106 (step 504). If the caller is not permitted
to send the signal (event 108), then the service sets
an error code (step 508), releases the process lock and
returns to the caller (step 518).
[0036] Once past the initial tests, the code loops
through the PAL 120 (step 506). Looking at an entry 122
in the PAL 120, if the current PID 124 is the same as the
event PID 106 (step 510), then this entry 122 is overlaid
by storing the event PID 106 over the PID 124 in the entry
and the event 108 over the event 126 in the entry
122 (step 512). If the PIDs don't match at step 510, then
if there are more entries in the PAL 120 (step 514), the
loop continues at step 506.

[0037] If the event PID 106 is not found in the PAL,
then a new entry 122 is chosen. This will normally just
use the next unused entry 122 in the PAL 120. If the PAL
120 is full, a new larger PAL is obtained, the old PAL 120
is copied into the new PAL and the address of the new
PAL is stored in the PIB 114 in field 118. Since the process
is locked (step 408), this can be done safely. After
copying the old PAL to the new PAL, the old PAL is freed.
The new entry is then stored as in step 512 using an
unused entry 122 in the PAL. The process lock is released
and control is returned to the caller (step 518).
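
The add logic of Fig. 5 (overlay a matching entry, otherwise take the next unused entry, growing the table when it is full) can be sketched as follows. The AffinityList class, its doubling growth policy and the pal_add signature are assumptions for illustration; the permission test of step 504 and the process lock are omitted to keep the sketch short.

```python
class AffinityList:
    """Fixed-capacity table of (PID, event) slots; the layout is an assumption."""

    def __init__(self, capacity=2):
        self.slots = [None] * capacity
        self.used = 0


def pal_add(pal, event_pid, event):
    """Add or overlay an entry and return the (possibly reallocated) table.

    The caller stores the returned table, much as the description stores the
    address of a new PAL in the PIB.
    """
    for i in range(pal.used):                  # steps 506-514: loop through the PAL
        if pal.slots[i][0] == event_pid:       # step 510: PID already present
            pal.slots[i] = (event_pid, event)  # step 512: overlay the entry
            return pal
    if pal.used == len(pal.slots):             # table full: obtain a larger one
        bigger = AffinityList(capacity=max(1, 2 * len(pal.slots)))
        bigger.slots[:pal.used] = pal.slots    # copy the old entries across
        bigger.used = pal.used
        pal = bigger                           # old table is dropped (freed)
    pal.slots[pal.used] = (event_pid, event)   # store in the next unused entry
    pal.used += 1
    return pal


pal = AffinityList(capacity=2)
pal = pal_add(pal, 100, 15)
pal = pal_add(pal, 101, 15)
pal = pal_add(pal, 100, 10)  # overlays the existing entry for PID 100
pal = pal_add(pal, 102, 15)  # table is full, so it grows before storing
```

Overlaying rather than appending means a process appears at most once in a given list, so a repeated add simply changes which event it will receive.
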
[0038] Although a particular embodiment of the invention
has been shown and described, various modifications
and extensions within the scope of the appended
claims will be apparent to those skilled in the art.



Claims 

1. A method of providing for notification of task termination
in an information handling system having a
plurality of interacting tasks, the method comprising
the steps of:

defining for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, adding
an entry for the other task to an affinity list
defined for the target task; and

in response to detecting a termination of a target
task, notifying each other task contained in
the affinity list defined for the target task.

2. The method of claim 1 in which the affinity request
originates from the target task.

3. The method of claim 1 in which the affinity request
originates from the other task.

4. The method of claim 1 in which the affinity request
is of a first type, the method comprising the further
step of:

in response to receiving an affinity request of
a second type specifying a target task and another
task, deleting an entry for the other task from the
affinity list defined for the target task.

5. The method of any preceding claim in which the
adding step comprises the steps of:

determining whether an affinity list is already
defined for the target task;

if an affinity list is already defined for the target
task, adding an entry for the other task to the
affinity list defined for the target task; and

if an affinity list is not already defined for the
target task, defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task.

6. The method of any preceding claim in which the
tasks are processes having separate address spaces.

7. The method of claim 1 in which the tasks are user
tasks and the steps are performed by an operating
system kernel.

8. The method of claim 1 in which the affinity request
specifies a type of operation to be performed on the
affinity list defined for the target process.

9. The method of claim 1 in which the affinity request
specifies an event to be generated for the other task
upon termination of the target task.

10. Apparatus for providing for the notification of task
termination in an information handling system having
a plurality of interacting tasks, the apparatus
comprising:

means for defining for each of one or more target
tasks an affinity list containing one or more
entries for other tasks that are to be notified on
termination of the target task;

means responsive to receiving an affinity request
specifying a target task and another task
for adding an entry for the other task to an affinity
list defined for the target task; and

means responsive to detecting a termination of
a target task for notifying each other task contained
in the affinity list defined for the target
task.

11. The apparatus of claim 10 in which the affinity request
is of a first type, the apparatus further comprising:

means responsive to receiving an affinity request
of a second type specifying a target task and
another task for deleting an entry for the other task
from the affinity list defined for the target task.

12. The apparatus of claim 10 in which the adding means
comprises:

means for determining whether an affinity list is
already defined for the target task;

means for adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is already defined for the target task;
and

means for defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is not already defined for the target
task.

13. A computer program element comprising computer
program code means executable by the computer
to:

define for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, add
an entry for the other task to an affinity list defined
for the target task; and

in response to detecting a termination of a target
task, notify each other task contained in the
affinity list defined for the target task.

14. The computer program element of claim 13 in which
the affinity request is of a first type, further comprising
computer program code means executable by
the computer to:

in response to receiving an affinity request of
a second type specifying a target task and another
task, delete an entry for the other task from the affinity
list defined for the target task.

15. The computer program element of claim 13 or claim
14 embodied on a computer readable medium.